
    Learning by stochastic serializations

    Complex structures are common in machine learning. Tailoring learning algorithms to every structure requires an effort that can be saved by defining a generic learning procedure that adapts to any complex structure. In this paper, we propose to map any complex structure onto a generic form, called a serialization, over which any sequence-based density estimator can be applied. We then show how to transfer the learned density back onto the space of original structures. To expose the learning procedure to the structural particularities of the original structures, we take care that the serializations accurately reflect the structures' properties. Enumerating all serializations is infeasible, so we propose an effective way to sample representative serializations that preserves the statistics of the complete set. Our method is competitive with or better than state-of-the-art learning algorithms that have been specifically designed for given structures. In addition, since serialization involves sampling from a combinatorial process, it provides considerable protection from overfitting, which we demonstrate in a number of experiments.
    Comment: Submission to NeurIPS 201
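
    As a toy illustration of the general idea (not the authors' implementation), the sketch below samples depth-first serializations of a small labelled tree by shuffling child order, then fits a simple add-one-smoothed bigram density estimator to the sampled sequences; the tree, labels and estimator are assumptions made for the example.

```python
# Hypothetical sketch: sample serializations of a tree, fit a sequence density estimator.
import random
from collections import defaultdict

def sample_serialization(tree, node, rng):
    """Emit node labels in a depth-first order with randomly shuffled children."""
    seq = [tree[node]["label"]]
    children = list(tree[node]["children"])
    rng.shuffle(children)            # random child order -> one sampled serialization
    for c in children:
        seq.extend(sample_serialization(tree, c, rng))
    return seq

def fit_bigram(sequences, vocab):
    """Add-one-smoothed bigram density estimator over serialized sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(["<s>"] + seq, seq + ["</s>"]):
            counts[a][b] += 1
    def prob(seq):
        p = 1.0
        for a, b in zip(["<s>"] + seq, seq + ["</s>"]):
            total = sum(counts[a].values())
            p *= (counts[a][b] + 1) / (total + len(vocab) + 2)
        return p
    return prob

rng = random.Random(0)
# Toy labelled tree: node id -> {"label", "children"}
tree = {
    0: {"label": "A", "children": [1, 2]},
    1: {"label": "B", "children": []},
    2: {"label": "C", "children": [3]},
    3: {"label": "B", "children": []},
}
samples = [sample_serialization(tree, 0, rng) for _ in range(100)]
density = fit_bigram(samples, vocab={"A", "B", "C"})
print(density(["A", "B", "C", "B"]))   # probability of one particular serialization
```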

    Two-Stage Metric Learning

    In this paper, we present a novel two-stage metric learning algorithm. We first map each learning instance to a probability distribution by computing its similarities to a set of fixed anchor points. Then, we define the distance in the input data space as the Fisher information distance on the associated statistical manifold. This induces in the input data space a new family of distance metrics with unique properties. Unlike kernelized metric learning, we do not require the similarity measure to be positive semi-definite. Moreover, the method can also be interpreted as a local metric learning algorithm with a well-defined distance approximation. We evaluate its performance on a number of datasets, where it significantly outperforms other metric learning methods and SVMs.
    Comment: Accepted for publication in ICML 201
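
    The closed-form part of this pipeline is easy to illustrate. The hypothetical sketch below maps points to distributions over a fixed set of anchor points via a softmax of negative squared distances, then compares them with the Fisher-Rao distance on the probability simplex (2·arccos of the Bhattacharyya coefficient); the anchor choice and similarity kernel are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch: instance -> distribution over anchors -> Fisher-Rao distance.
import numpy as np

def to_distribution(x, anchors, gamma=1.0):
    """Softmax over negative squared distances to a fixed set of anchor points."""
    d2 = np.sum((anchors - x) ** 2, axis=1)
    w = np.exp(-gamma * d2)
    return w / w.sum()

def fisher_distance(p, q):
    """Fisher-Rao (information) distance between two multinomial distributions."""
    bc = np.clip(np.sum(np.sqrt(p * q)), 0.0, 1.0)   # Bhattacharyya coefficient
    return 2.0 * np.arccos(bc)

rng = np.random.default_rng(0)
anchors = rng.normal(size=(5, 2))                    # hypothetical anchor points
x1, x2 = rng.normal(size=2), rng.normal(size=2)
p, q = to_distribution(x1, anchors), to_distribution(x2, anchors)
print(fisher_distance(p, q))                         # distance between the two instances
```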

    Learning Interpretable Microscopic Features of Tumor by Multi-task Adversarial CNNs Improves Generalization

    Adopting Convolutional Neural Networks (CNNs) in the daily routine of primary diagnosis requires not only near-perfect precision, but also a sufficient degree of generalization to data acquisition shifts, as well as transparency. Existing CNN models act as black boxes and do not assure physicians that important diagnostic features are used by the model. Building on successful existing techniques such as multi-task learning, domain-adversarial training and concept-based interpretability, this paper addresses the challenge of introducing diagnostic factors into the training objectives. We show that our architecture, by learning end-to-end an uncertainty-based weighting of the multi-task and adversarial losses, is encouraged to focus on pathology features such as density and pleomorphism of nuclei, e.g. variations in size and appearance, while discarding misleading features such as staining differences. Our results on breast lymph node tissue show significantly improved generalization in the detection of tumorous tissue, with a best average AUC of 0.89 (0.01) against the baseline AUC of 0.86 (0.005). By applying the interpretability technique of linearly probing intermediate representations, we also demonstrate that interpretable pathology features such as nuclei density are learned by the proposed CNN architecture, confirming the increased transparency of this model. This result is a starting point towards building interpretable multi-task architectures that are robust to data heterogeneity. Our code is available at https://bit.ly/356yQ2u.
    Comment: 21 pages, 4 figures
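
    The general recipe of combining uncertainty-weighted task losses with a domain-adversarial branch can be sketched as follows; the network, heads, loss terms and shapes below are illustrative assumptions (a stand-in for the actual CNN), not the released code.

```python
# Hedged sketch: uncertainty-weighted multi-task losses + gradient-reversal adversarial branch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None   # reverse gradients flowing into the backbone

class MultiTaskAdversarialNet(nn.Module):
    def __init__(self, feat_dim=64, n_tumor=2, n_concepts=3, n_domains=4):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(128, feat_dim), nn.ReLU())  # stand-in for a CNN
        self.tumor_head = nn.Linear(feat_dim, n_tumor)
        self.concept_head = nn.Linear(feat_dim, n_concepts)   # e.g. nuclei density / pleomorphism
        self.domain_head = nn.Linear(feat_dim, n_domains)     # adversarial (e.g. staining / site)
        self.log_vars = nn.Parameter(torch.zeros(2))          # learned task uncertainties

    def forward(self, x, lam=1.0):
        f = self.backbone(x)
        return (self.tumor_head(f),
                self.concept_head(f),
                self.domain_head(GradReverse.apply(f, lam)))

def total_loss(model, outputs, y_tumor, y_concept, y_domain):
    t, c, d = outputs
    l_tumor = F.cross_entropy(t, y_tumor)
    l_concept = F.mse_loss(c, y_concept)
    l_domain = F.cross_entropy(d, y_domain)   # gradients are reversed into the backbone
    s = model.log_vars
    # Homoscedastic-uncertainty weighting of the two main tasks, plus the adversarial term.
    return torch.exp(-s[0]) * l_tumor + s[0] + torch.exp(-s[1]) * l_concept + s[1] + l_domain
```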

    Multimedia Analysis and Access of Ancient Maya Epigraphy

    This article presents an integrated framework for multimedia access and analysis of ancient Maya epigraphic resources, developed as an interdisciplinary effort involving epigraphers (scholars who decipher ancient inscriptions) and computer scientists. Our work includes several contributions: a definition of consistent conventions to generate high-quality representations of Maya hieroglyphs from the three most valuable ancient codices, which currently reside in European museums and institutions; a digital repository system for glyph annotation and management; and automatic glyph retrieval and classification methods. We study the combination of statistical Maya language models and shape representations within a hieroglyph retrieval system, the impact of applying language models extracted from different hieroglyphic resources to various data types, and the effect of shape representation choices on glyph classification. A novel Maya hieroglyph data set is provided, which can be used for shape analysis benchmarks and also to study the ancient Maya writing system.
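
    The interplay of shape similarity and a glyph language model can be illustrated with a toy re-ranking step; the glyph codes, scores and interpolation weight below are made up for the example and do not come from the article.

```python
# Toy sketch: log-linear combination of visual similarity and a bigram glyph language model.
import math

shape_score = {"T0668": 0.62, "T0102": 0.55, "T0001": 0.31}   # visual similarity to the query glyph
bigram = {("T0501", "T0668"): 0.20, ("T0501", "T0102"): 0.05, ("T0501", "T0001"): 0.01}
previous_glyph = "T0501"   # glyph preceding the query sign within the same block
alpha = 0.7                # weight between visual and linguistic evidence

def combined(candidate):
    lm = bigram.get((previous_glyph, candidate), 1e-4)        # smoothed language-model probability
    return alpha * math.log(shape_score[candidate]) + (1 - alpha) * math.log(lm)

ranking = sorted(shape_score, key=combined, reverse=True)
print(ranking)             # candidates re-ranked by combined visual + linguistic score
```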

    Pseudo two-dimensional Hidden Markov Models for Face Detection in Colour Images

    This paper introduces the use of Hidden Markov Models (HMMs) as an alternative to techniques classically used for face detection. Our aim is to locate faces in colour images of a video sequence with a view to indexing. The use of HMMs in pattern recognition is first briefly reviewed, and the mapping of these models onto our problem is presented. Pseudo two-dimensional HMMs are presented and shown to be efficient, well-suited tools for performing face detection in a context where no constraints on face orientation are given. Issues in efficient face modelling are discussed and illustrated with practical examples.
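
    A full pseudo two-dimensional HMM is beyond a short example, but the flavour of HMM-based window scoring can be sketched with a plain one-dimensional GaussianHMM over row-block features; the hmmlearn usage and the feature extraction below are assumptions for illustration, not the paper's model.

```python
# Rough sketch: score candidate windows with an HMM over top-to-bottom row-block features.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def row_block_features(window, block_h=4):
    """Mean/std of horizontal strips, scanned top to bottom (forehead -> eyes -> mouth)."""
    h, w = window.shape
    feats = []
    for top in range(0, h - block_h + 1, block_h):
        strip = window[top:top + block_h, :]
        feats.append([strip.mean(), strip.std()])
    return np.asarray(feats)

rng = np.random.default_rng(0)
train_windows = [rng.random((32, 32)) for _ in range(20)]     # placeholder "face" windows
sequences = [row_block_features(w) for w in train_windows]
X = np.vstack(sequences)
lengths = [len(s) for s in sequences]

face_model = GaussianHMM(n_components=5, covariance_type="diag", n_iter=20).fit(X, lengths)
candidate = row_block_features(rng.random((32, 32)))
print(face_model.score(candidate))     # higher log-likelihood -> more face-like window
```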

    Distance Transformation for Effective Dimension Reduction of High-Dimensional Data

    In this paper we address the problem of high dimensionality for data that lies on complex manifolds. In high-dimensional spaces, the distances to the nearest and farthest neighbours tend to become equal, which hampers data analysis tasks such as clustering. We show that distance transformation can be used effectively to obtain an embedding space of lower dimensionality than the original space, and that it increases the quality of data analysis. The new method, called High-Dimensional Multimodal Embedding (HDME), is compared with state-of-the-art methods operating in high-dimensional spaces and shown to be effective in terms of both retrieval and clustering on real-world data.
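
    The general idea of a distance-transformation embedding (not the HDME algorithm itself, whose details are not given here) can be sketched as: transform the pairwise distance matrix with a monotone map that re-spreads near and far neighbours, then embed the transformed distances in a lower-dimensional space.

```python
# Illustrative sketch: transform pairwise distances, then embed them with metric MDS.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 200))                               # high-dimensional data

D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)    # pairwise Euclidean distances
D_t = 1.0 - np.exp(-D / np.median(D))                         # monotone transform: accentuates local contrasts
np.fill_diagonal(D_t, 0.0)

embedding = MDS(n_components=10, dissimilarity="precomputed",
                random_state=0).fit_transform(D_t)
print(embedding.shape)                                        # (100, 10): lower-dimensional representation
```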

    Information fusion in multimedia information retrieval

    In the retrieval, indexing and classification of multimedia data, efficient information fusion of the different modalities is essential for the system's overall performance. Since information fusion, its influencing factors and the limits of its performance improvements have been actively discussed in recent years across different research communities, we review their latest findings. Most importantly, they point out that exploiting the dependencies between features and modalities yields maximal performance. This is underlined by data analysis and fusion experiments with annotated image collections.
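
    A minimal late-fusion example of combining per-modality scores is sketched below; the scores and weights are made up, and the example does not capture the dependency modelling the review emphasizes.

```python
# Toy sketch: weighted late fusion of per-modality relevance scores into one ranking.
import numpy as np

# Relevance scores for 4 documents from two modalities (e.g. visual and textual).
visual = np.array([0.9, 0.2, 0.6, 0.4])
textual = np.array([0.7, 0.8, 0.1, 0.5])

weights = np.array([0.6, 0.4])          # e.g. tuned on validation data
fused = weights[0] * visual + weights[1] * textual
print(np.argsort(-fused))               # documents ranked by fused score
```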